Warning in transformCI(ci2, log): Only monotonic functions are meaningful for
transforming confidence intervals.
[1] 2.659343
c
# Expected: not allowed
waldCI(level = 0.95, mean = 10, sterr = -1)
Error in validObject(.Object): invalid class "waldCI" object: @upper bound 8.04003601545995 must be strictly greater than @lower bound 11.9599639845401
waldCI(level = 0.95, lb = 5, ub = 4)
Error in validObject(.Object): invalid class "waldCI" object: @upper bound 4 must be strictly greater than @lower bound 5
waldCI(level = 0.95, lb = 5, ub = 5)
Error in validObject(.Object): invalid class "waldCI" object: @upper bound 5 must be strictly greater than @lower bound 5
waldCI(level = 1, mean = 10, sterr = 1)
Error in validObject(.Object): invalid class "waldCI" object: @confidence level = 1 is not a valid confidence level (must be 0 < level < 1).
Error in validObject(object): invalid class "waldCI" object: @upper bound 20 must be strictly greater than @lower bound 25
Problem 3
a
There is one major spike and about four minor spikes. I queried ChatGPT for plotly syntax and revised the resulting code.
covid <- read.csv("us-states.txt")
library(dplyr)
Attaching package: 'dplyr'
The following object is masked _by_ '.GlobalEnv':
contains
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
library(plotly)
Warning: package 'plotly' was built under R version 4.3.3
Loading required package: ggplot2
Warning: package 'ggplot2' was built under R version 4.3.3
Attaching package: 'plotly'
The following object is masked from 'package:ggplot2':
last_plot
The following object is masked from 'package:stats':
filter
The following object is masked from 'package:graphics':
layout
national_cases <- covid %>%
  # Convert the date column to the Date format
  mutate(date = as.Date(date)) %>%
  # Group data by date
  group_by(date) %>%
  # Summarize to calculate the national daily average cases
  summarise(
    National_Cases_Avg = sum(cases_avg, na.rm = TRUE),
    .groups = 'drop'
  )

# 2. Create the Interactive Plot using plot_ly()
covid_spikes_plot_plotly <- national_cases %>%
  # Start the plot with plot_ly, defining x, y, type, mode, and line style
  plot_ly(
    x = ~date, y = ~National_Cases_Avg,
    type = 'scatter', mode = 'lines',
    line = list(color = 'darkred', width = 1)
  ) %>%
  layout(
    title = list(text = "U.S. National COVID-19 Waves<br><sup>7-Day Average New Cases</sup>"),
    xaxis = list(
      title = "Date",
      dtick = "M6", tickformat = "%Y %b" # Display format: 2020 Jan
    ),
    yaxis = list(title = "Avg. New Cases (National)")
  )

# 3. Print the interactive chart
covid_spikes_plot_plotly
b
I calculate the cumulative COVID case rate per 100k population for every state (the sum of cases_avg_per_100k over all dates) and rank the states from highest to lowest to find the states with the highest and lowest overall rates. The main difference is that Maine has a huge spike in 2022, whereas Rhode Island experiences only minor spikes at the same time.
state_ranking <- covid %>%
  mutate(date = as.Date(date)) %>%
  group_by(state) %>%
  summarise(Cumulative_Rate = sum(cases_avg_per_100k, na.rm = TRUE), .groups = 'drop') %>%
  # Arrange states in descending order of the cumulative rate
  arrange(desc(Cumulative_Rate))

# Get the names of the highest and lowest ranked states
highest_state_name <- head(state_ranking$state, 1)
lowest_state_name <- tail(state_ranking$state, 1)

# Filter the main data to include only the two states for comparison
comparison_data <- covid %>%
  filter(state %in% c(highest_state_name, lowest_state_name)) %>%
  mutate(date = as.Date(date))

trajectory_plot_plotly <- comparison_data %>%
  # Start the plot with plot_ly
  plot_ly(x = ~date, y = ~cases_avg_per_100k, color = ~state, type = 'scatter', mode = 'lines') %>%
  # Set the chart layout (titles and labels)
  layout(
    title = "COVID-19 Trajectory",
    xaxis = list(title = "Date"),
    yaxis = list(title = "7-Day Avg. Cases per 100k")
  )

trajectory_plot_plotly
c
I referred to the Problem Set 5 solution and ChatGPT for plotly syntax. Even in this cluttered plot, we can see that the first spike runs approximately from 2020-03-21 to 2020-05-20.
library(lubridate)
Attaching package: 'lubridate'
The following objects are masked from 'package:base':
date, intersect, setdiff, union
covid20 <- covid %>%
  mutate(Date = as.Date(date)) %>%
  filter(year(Date) == 2020) %>%
  select(state, date, cases_avg_per_100k)

trajectory_plot_no_legend <- covid20 %>%
  plot_ly(
    x = ~date, y = ~cases_avg_per_100k, split = ~state,
    type = 'scatter', mode = 'lines',
    showlegend = FALSE
  ) %>%
  layout(yaxis = list(title = "COVID cases per 100k"))

trajectory_plot_no_legend
I restrict the data to rows where cases_avg_per_100k is larger than 15 and zoom in on March, April, and May. The plot shows that the top 5 states are New York, New Jersey, Louisiana, Guam, and Connecticut.
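The restriction described above could be sketched roughly as follows. This is only an illustration, not necessarily the code used for the plot: `toy` is a made-up stand-in for the `covid` data frame with invented values, and ranking states by their peak rate within the window is an assumption (the answer above reads the top states off the plot).

```r
library(dplyr)
library(lubridate)

# Toy stand-in for the `covid` data frame; values are invented for illustration only.
toy <- data.frame(
  state = c("A", "A", "B", "B", "C", "C"),
  date = c("2020-03-25", "2020-04-10", "2020-04-10",
           "2020-06-01", "2020-04-15", "2020-05-01"),
  cases_avg_per_100k = c(10, 40, 20, 50, 5, 16)
)

first_spike <- toy %>%
  mutate(date = as.Date(date)) %>%
  # Keep March through May 2020 and rates above the threshold of 15
  filter(year(date) == 2020, month(date) %in% 3:5, cases_avg_per_100k > 15)

# Assumption: rank states by their peak rate inside the window
top_states <- first_spike %>%
  group_by(state) %>%
  summarise(peak_rate = max(cases_avg_per_100k), .groups = "drop") %>%
  arrange(desc(peak_rate)) %>%
  slice_head(n = 5)

print(top_states)
```

Replacing `toy` with `covid` and piping `first_spike` into `plot_ly()` with `split = ~state` would give a zoomed version of the 2020 plot.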